Combining frame and segment based models for environmental sound classification
Abstract
This paper considers the task of recognizing environmental sounds, which plays a critical role in human perception of the auditory context in audiovisual materials. A variety of features have been proposed for audio recognition, either frame-based or segmental. Here, we propose a two-stage framework that combines modeling at these two levels. In the first stage, Gaussian Mixture Models (GMMs) are built on short-term features and a preclassification is performed. In the second stage, if the GMMs are not certain about the result, the system engages Support Vector Machines (SVMs) to refine the output hypothesis; the two feature levels are combined by feeding the GMM posterior estimates, together with the segmental features, to the SVMs as input. Experiments on the sound dataset show that the proposed framework improves on traditional methods.
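The gating step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how GMM certainty is measured, so the softmax over length-normalized log-likelihoods, the equal class priors, the `threshold` value, and the function names `segment_posteriors` and `classify_two_stage` are all assumptions. The second-stage classifier is passed in as a plain callable standing in for a trained SVM.

```python
import math
from typing import Callable, List, Sequence, Tuple


def segment_posteriors(frame_loglik: Sequence[Sequence[float]]) -> List[float]:
    """Turn per-frame, per-class log-likelihoods into segment-level scores.

    Assumption: equal class priors. Frame log-likelihoods are averaged
    (length-normalized) and passed through a softmax, so the result is a
    tempered posterior rather than the exact segment posterior.
    """
    n_frames = len(frame_loglik)
    n_classes = len(frame_loglik[0])
    avg = [sum(f[c] for f in frame_loglik) / n_frames for c in range(n_classes)]
    m = max(avg)  # subtract the maximum for numerical stability
    exps = [math.exp(a - m) for a in avg]
    z = sum(exps)
    return [e / z for e in exps]


def classify_two_stage(
    frame_loglik: Sequence[Sequence[float]],
    segment_features: Sequence[float],
    second_stage: Callable[[List[float]], int],
    threshold: float = 0.8,  # hypothetical confidence threshold
) -> Tuple[int, str]:
    """Return (class index, stage that made the decision)."""
    post = segment_posteriors(frame_loglik)
    best = max(range(len(post)), key=post.__getitem__)
    if post[best] >= threshold:
        # Stage 1 is confident: keep the frame-based (GMM) decision.
        return best, "stage1"
    # Stage 1 is uncertain: combine the posterior estimates with the
    # segmental features and defer to the second-stage classifier.
    combined = list(post) + list(segment_features)
    return second_stage(combined), "stage2"
```

In practice, `frame_loglik` would come from one GMM per class scored on each frame, and `second_stage` would wrap the prediction call of an SVM trained on the same concatenated vectors.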
Similar resources
Automatic classification of normal and abnormal cardiac sounds by combining features based on wavelet transform and cepstral coefficients extracted from PCG signals (Research Article)
Cardiac sounds are produced by the mechanical activities of the heart and provide useful information about the function of the heart valves. Due to the transient and unstable nature of the heart's sound and the limitation of the human hearing system, it is difficult to categorize heart sound signals based on what is heard from a stethoscope. Therefore, providing an automated algorithm for prima...
Parallel and hierarchical speech feature classification using frame and segment-based methods
Phonemes in the English language can be represented using either parallel or hierarchical distinctive speech features. There have been a number of efforts to integrate multiple information sources but none of these efforts addressed the issue of combining multiple sets of articulatory/linguistic features with different organization topologies. In this study, we combine a frame-based parallel sp...
PROTAX-Sound: A probabilistic framework for automated animal sound identification
Autonomous audio recording is stimulating a new field in bioacoustics, with great promise for conducting cost-effective species surveys. One major current challenge is the lack of reliable classifiers capable of multi-species identification. We present PROTAX-Sound, a statistical framework to perform probabilistic classification of animal sounds. PROTAX-Sound is based on a multinomial regressio...
Polarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm
Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes strongly affect the Earth and the life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...
Spectral-spatial classification of hyperspectral images by combining hierarchical and marker-based Minimum Spanning Forest algorithms
Many studies have demonstrated that spatial information can play an important role in the classification of hyperspectral imagery. This study proposes a modified spectral–spatial classification approach for improving the spectral–spatial classification of hyperspectral images. In the proposed method ten spatial/texture features, using mean, standard deviation, contrast, homogeneity, corr...